ABSTRACT

The weak-field approximation is one of the simplest models that allows us to relate the observed
polarization induced by the Zeeman effect with the magnetic field vector present in the plasma of
interest. It is usually applied for diagnosing magnetic fields in solar and stellar atmospheres. A
fully Bayesian approach to the inference of magnetic properties in unresolved structures is presented.
The analytical expression for the marginal posterior distribution is obtained, from which we can
obtain statistically relevant information about the model parameters. The role of a priori information
is discussed and a hierarchical procedure is presented that gives robust results that are almost
insensitive to the precise choice of the prior. The strength of the formalism is demonstrated through
an application to IMaX data. Bayesian methods can optimally exploit data from filter-polarimeters
given the scarcity of spectral information as compared with spectro-polarimeters. The effect of noise
and how it degrades our ability to extract information from the Stokes profiles is analyzed in detail.

Subject headings: methods: data analysis, statistical -- techniques: polarimetric -- Sun: photosphere

1. INTRODUCTION

Extracting information about the magnetic field vector from spectro-polarimetric observations is not devoid of
difficulties. The main one is that, practically always, one has to go through a modeling phase. This modeling
typically consists of setting up an atmospheric model that depends on some parameters that one wants to infer. Such
a procedure, usually known as spectro-polarimetric inversion, has made it possible to extract extremely valuable
information about the behavior of the magnetic field in the solar photosphere and chromosphere (see, e.g.,
Bellot Rubio 2006 and references therein).

It is hard to summarize in a few lines the history of spectro-polarimetric inversions from the first steps back in
the 1970s. The initially proposed models were of low complexity because the quality of the observations and the
computing power did not allow the use of more elaborate models (e.g., Auer et al. 1977; Skumanich & Lites 1987;
Lites & Skumanich 1990; Keller et al. 1990). Although the assumptions on which these models are based may not be
exactly fulfilled in the solar atmosphere, their simplicity made them the cornerstone of quantitative
spectropolarimetry. In fact, these models are still in use for interpreting high-quality observations from the most
advanced instruments (Lagg et al. 2004; Orozco Suárez et al. 2007; Borrero et al. 2010). Later on, inversion codes
based on the concept of response functions (Landi Degl'Innocenti & Landi Degl'Innocenti 1977) facilitated the
inversion of high-quality Stokes profiles, making it possible to infer vertical stratifications of the magnetic
properties of the atmosphere (Ruiz Cobo & del Toro Iniesta 1992; Socas-Navarro et al. 2000; Frutiger et al. 2000).
After the enormous success of standard inversion methods based on least-squares optimization, it is time to study
in depth the inversion process itself and introduce more powerful techniques. This is what has been done recently
by Asensio Ramos et al. (2007), who treated the inversion process as a Bayesian probabilistic inference problem.
These techniques, which fully exploit the information encoded in the Stokes profiles once a model has been
proposed to explain them, have been used recently by Asensio Ramos (2009, 2010) to infer that fields in the
quietest regions of the solar internetwork appear to be quasi-isotropically distributed, reinforcing the previous
results of Martínez González et al. (2008). We consider that the Bayesian approach is the best choice, especially in
those cases in which the spectro-polarimetric signal is at the noise level or the wavelength coverage is very sparse
(as in filter-polarimeters). Additionally, it gives the opportunity to quantitatively compare different models and
to use different models as a committee to gain insight into a common physical parameter (Asensio Ramos 2010).

The approach followed by Asensio Ramos et al. (2007), based on a Markov Chain Monte Carlo sampler, is completely
general, so it can cope with very complex radiative transfer forward problems. The main drawback is that it can
become costly in terms of computational time, although not prohibitively so, as already shown by Asensio Ramos (2009).
Obviously, Bayesian inference cannot be compared to standard inversion methods just by the computing cost, because
the amount of information obtained is much richer. In order to motivate the application of Bayesian inference, we
consider in this paper the assumption of weak field (Landi Degl'Innocenti & Landolfi 2004) for the inference of
magnetic fields from the observation of Stokes profiles. This is one of the most straightforward approximations one
can consider to relate the Stokes profiles and the magnetic field vector. In spite of its simplicity, its range of
applicability is very broad and it is systematically applied in different fields, from solar to stellar physics, as
shown in the next section. As we show, the simplicity of the model allows us to obtain analytical expressions for
some of the posterior distributions of the model parameters. We consider that the weak-field approximation is of
practical application while simultaneously being simple enough to demonstrate the power of Bayesian inference and
show its fundamental points.

Section 2 describes the model. Section 3 shows the simple non-hierarchical approach and also sets the notation 
used throughout the paper. Section 4 presents the robust hierarchical model for the case in which the noise standard 
deviation is known and when it is inferred from the data. Finally, the method is demonstrated with some examples in 
section 5. 

2. THE ZEEMAN WEAK-FIELD APPROXIMATION 

When the splitting produced in a given spectral line via the Zeeman effect by the presence of a deterministic
magnetic field ($\bar{g}\Delta\lambda_B$, with $\bar{g}$ the effective Landé factor) is smaller than the line
broadening ($\Delta\lambda_D$), the line is said to be well modeled in the weak-field regime
(Landi Degl'Innocenti & Landi Degl'Innocenti 1973). Writing the expressions for $\Delta\lambda_B$ and
$\Delta\lambda_D$ (e.g., Landi Degl'Innocenti & Landolfi 2004), a line is in the weak-field approximation when the
magnetic field strength fulfills:

$$B < \frac{4\pi m c}{\bar{g}\lambda_0 e_0} \sqrt{\frac{2kT}{M} + v_\mathrm{mic}^2}, \tag{1}$$

where $m$ and $e_0$ are the electron mass and charge, respectively, $c$ is the speed of light, $k$ is the Boltzmann
constant, $T$ is the temperature, $M$ is the mass of the species, $\lambda_0$ is the central wavelength of the
spectral line under consideration and $v_\mathrm{mic}$ is the microturbulent velocity. For an iron line at
$\lambda_0 = 5000$ Å, using $v_\mathrm{mic} = 1$ km s$^{-1}$ and $T = 5800$ K, we end up with:

$$\bar{g}B < 2400\ \mathrm{G}. \tag{2}$$
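
As a quick numerical check of Eqs. (1) and (2), the following minimal sketch evaluates the right-hand side of Eq. (1) in CGS units for the iron line quoted above. The function name and the rounded constants are illustrative choices, not part of the original work; the atomic mass of iron (55.845 u) is an assumed input.

```python
import math

# Physical constants in CGS units (Eq. (1) is written in CGS).
ELECTRON_MASS = 9.1093837e-28      # g
ELECTRON_CHARGE = 4.8032047e-10    # statC
SPEED_OF_LIGHT = 2.99792458e10     # cm/s
BOLTZMANN = 1.380649e-16           # erg/K
ATOMIC_MASS_UNIT = 1.66053907e-24  # g


def weak_field_limit(wavelength_angstrom, temperature, atomic_mass, v_mic_kms):
    """Upper limit of g*B (in G) allowed by Eq. (1) for the weak-field regime."""
    wavelength_cm = wavelength_angstrom * 1e-8
    v_mic = v_mic_kms * 1e5  # km/s -> cm/s
    mass = atomic_mass * ATOMIC_MASS_UNIT
    # Thermal plus microturbulent velocity entering the Doppler width.
    doppler_velocity = math.sqrt(2.0 * BOLTZMANN * temperature / mass + v_mic**2)
    prefactor = 4.0 * math.pi * ELECTRON_MASS * SPEED_OF_LIGHT / (wavelength_cm * ELECTRON_CHARGE)
    return prefactor * doppler_velocity


# Iron line at 5000 A, T = 5800 K and 1 km/s of microturbulence (values from the text).
gB_max = weak_field_limit(5000.0, 5800.0, 55.845, 1.0)
print(f"g*B < {gB_max:.0f} G")
```

The result is close to the rounded value of 2400 G given in Eq. (2); changing the temperature or the microturbulence shows how the limit relaxes in hotter or more turbulent plasmas.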

In principle, this is more than enough to deal with photospheric magnetic fields outside active regions in the solar
atmosphere observed with magnetically sensitive lines in the optical (with $\bar{g} > 1$), or even with active regions
observed in lines with a weak magnetic sensitivity. Since the thermal width is enhanced in the chromosphere due to the
increased temperature and the field strength is known to be smaller, the weak-field approximation is especially
interesting for inferring magnetic fields at chromospheric heights.

In this approximation, a straightforward relation exists between the magnetic properties of the plasma and the
emergent Stokes parameters. In present-day observations of the weakly magnetized regions of the solar atmosphere we
cannot state that we are resolving all magnetic structures. If our resolution element is not filled with a
unidirectional magnetic field vector, we can mimic the loss of signal by assuming that the observed signal in the
pixel is obtained as the average of a magnetic component with a relative weight $f$ and a non-magnetic component with
weight $1-f$. The following equations hold for Stokes $V$ at first order in the field strength and for Stokes $Q$ and
$U$ at second order in the field strength (e.g., Landi Degl'Innocenti & Landolfi 2004):

$$\begin{aligned}
V(\lambda) &= \alpha f B_\parallel \frac{\partial I(\lambda)}{\partial \lambda}, \\
Q(\lambda) &= \beta f B_\perp^2 \cos 2\chi \, \frac{\partial^2 I(\lambda)}{\partial \lambda^2}, \\
U(\lambda) &= \beta f B_\perp^2 \sin 2\chi \, \frac{\partial^2 I(\lambda)}{\partial \lambda^2},
\end{aligned} \tag{3}$$

as functions of $B_\parallel$, the projection of the magnetic field vector along the line-of-sight (LOS), $B_\perp$,
the component of the vector perpendicular to the LOS, $\chi$, the field azimuth, and $I(\lambda)$, the wavelength
variation of the intensity across the spectral line. The proportionality constants $\alpha$ and $\beta$ have the
values:

$$\alpha = -4.67 \times 10^{-13}\, \bar{g} \lambda_0^2, \qquad \beta = -5.45 \times 10^{-26}\, \bar{G} \lambda_0^4, \tag{4}$$

with the wavelength $\lambda_0$ in Å and the components of the magnetic field measured in G. The factor $\bar{g}$ is
the effective Landé factor and $\bar{G}$ is the equivalent for linear polarization (e.g.,
Landi Degl'Innocenti & Landolfi 2004). Both factors measure the sensitivity of the spectral line to the presence of a
magnetic field. The previous expressions are only valid when $B_\perp$, the field azimuth $\chi$, the line-of-sight
velocity, the Doppler width, and any broadening mechanism are constant with height in the line formation region.
Additionally, the expressions for Stokes $Q$ and $U$ can only be applied to non-saturated lines. Since the field
azimuth is constant with height in the atmosphere, it is possible to define the total linear polarization
$L = (Q^2 + U^2)^{1/2}$, which can be written as:

$$L(\lambda) = \left| \beta f B_\perp^2 \frac{\partial^2 I(\lambda)}{\partial \lambda^2} \right|, \tag{5}$$

where the absolute value is a consequence of the definition of $L$. From this point, we assume that $Q(\lambda)$,
$U(\lambda)$ and $V(\lambda)$, the observed wavelength variation of the linear and circular polarization profiles,
respectively, can be correctly modeled with the aid of Eqs. (3).
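
The forward model of Eqs. (3) and (4) can be sketched in a few lines. The example below is only an illustration: it assumes a Gaussian absorption line for $I(\lambda)$ (so that the derivatives are analytical), and the line depth, width, central wavelength and Landé factors are hypothetical values chosen for the demonstration.

```python
import math


def weak_field_stokes(wavelengths, lambda0, f, B_par, B_perp, chi,
                      g_eff=3.0, G_eff=9.0, depth=0.5, width=0.04):
    """Stokes V, Q, U from Eqs. (3)-(4) for an assumed Gaussian absorption line.

    Wavelengths are in Angstrom and fields in G. The intensity profile
    I = 1 - depth * exp(-x^2 / (2 width^2)), with x = lambda - lambda0, is an
    illustrative assumption, not part of the weak-field approximation itself.
    """
    alpha = -4.67e-13 * g_eff * lambda0**2
    beta = -5.45e-26 * G_eff * lambda0**4
    V, Q, U = [], [], []
    for lam in wavelengths:
        x = lam - lambda0
        gauss = math.exp(-0.5 * (x / width) ** 2)
        dI = depth * x / width**2 * gauss                       # first derivative of I
        d2I = depth * gauss * (1.0 / width**2 - x**2 / width**4)  # second derivative of I
        V.append(alpha * f * B_par * dI)
        Q.append(beta * f * B_perp**2 * math.cos(2.0 * chi) * d2I)
        U.append(beta * f * B_perp**2 * math.sin(2.0 * chi) * d2I)
    return V, Q, U


lambda0 = 5250.2
chi = 0.3
wavelengths = [lambda0 + 0.01 * (i - 20) for i in range(41)]
V, Q, U = weak_field_stokes(wavelengths, lambda0, f=0.3, B_par=500.0, B_perp=300.0, chi=chi)
```

Stokes $V$ is antisymmetric about the line center because it follows the first derivative of the intensity, while $Q$ and $U$ share the symmetric shape of the second derivative and differ only by the factor $\tan 2\chi$.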

In spite of, and thanks to, its simplicity, the weak-field approximation is broadly applied for the inference of solar
and stellar magnetic fields from the observation of Stokes profiles. A limited selection includes the detection and
diagnostics of magnetic fields in central stars of planetary nebulae (Jordan et al. 2005), white dwarfs
(Aznar Cuadrado et al. 2004), pulsating stars (Silvester et al. 2009), hot subdwarfs (O'Toole et al. 2005), Ap and Bp
stars (Wade et al. 2000) and chemically peculiar stars (Bagnulo et al. 2002). The least-squares deconvolution (LSD)
technique that is widely used for the detection of magnetic fields in solar-type stars (Donati et al. 1997) is also
fundamentally based on the weak-field approximation. Many synoptic magnetographs like those of Big Bear
(Spirock et al. 2001; Varsik 1995) are calibrated using this formalism or an equivalent one. The circular
polarization observed in chromospheric lines like the 10830 Å multiplet of He I is very well modeled under the
weak-field approximation (Merenda et al. 2006; Asensio Ramos et al. 200?). It is even used to produce modern vector
magnetograms like those obtained with the IMaX instrument (Martínez Pillet et al. 2011) onboard the Sunrise balloon
(Solanki et al. 2010).



3. NON-HIERARCHICAL BAYESIAN APPROACH 

The weak-field approximation is a model for the interpretation of observed Stokes profiles that depends on the
four-vector of parameters $(f, B_\parallel, B_\perp, \chi)$. Under the Bayesian approach (see Asensio Ramos et al.
2007; Asensio Ramos 2009 and references therein), all knowledge gained about the parameters when some dataset $D$ is
presented to the model is encoded in the posterior probability distribution function
$p(f, B_\parallel, B_\perp, \chi|D)$. Using Bayes' theorem, this posterior distribution can be written as:

$$p(f, B_\parallel, B_\perp, \chi|D) = \frac{p(D|f, B_\parallel, B_\perp, \chi)\, p(f, B_\parallel, B_\perp, \chi)}{p(D)}, \tag{6}$$

where $p(D|f, B_\parallel, B_\perp, \chi)$ is the likelihood, which takes into account the influence of the data on our
knowledge of the model parameters, and $p(f, B_\parallel, B_\perp, \chi)$ is the prior distribution, which accounts for
all information about the parameters known in advance. The term $p(D)$ is the evidence or marginal likelihood (the
integral of the product of likelihood and prior over the whole parameter space) that, for parameter inference, is just
an unimportant multiplicative constant that we neglect in the following. We now analyze the analytical form of all
quantities present in Eq. (6).



3.1. Likelihood 

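When Stokes profiles are observed with a spectro-polarimeter or a filter-polarimeter we measure the circular and
linear polarization at $N$ discrete wavelength points. Consequently, the observables are
$\{(V_i, Q_i, U_i),\ i = 1, \ldots, N\}$, where $V_i$, $Q_i$ and $U_i$ represent the values of the circular and linear
polarization at wavelength $\lambda_i$. Assuming that the observations are corrupted with uncorrelated Gaussian noise
with zero mean and variance $\sigma_n^2$, the likelihood function is given by the following expression:

$$p(D|f, B_\parallel, B_\perp, \chi) = (2\pi)^{-3N/2} \sigma_n^{-3N} \exp\left\{ -\frac{1}{2\sigma_n^2} \left[ \sum_{i=1}^{N} \left( V_i - \alpha f B_\parallel \left.\frac{\partial I}{\partial \lambda}\right|_i \right)^2 + \sum_{i=1}^{N} \left( Q_i - \beta f B_\perp^2 \cos 2\chi \left.\frac{\partial^2 I}{\partial \lambda^2}\right|_i \right)^2 + \sum_{i=1}^{N} \left( U_i - \beta f B_\perp^2 \sin 2\chi \left.\frac{\partial^2 I}{\partial \lambda^2}\right|_i \right)^2 \right] \right\}, \tag{7}$$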



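where we have assumed that the total likelihood is given by the product of the likelihoods for each wavelength point
and Stokes parameter. Note that we have assumed the same noise variance for Stokes $Q$, $U$ and $V$. If this is not the
case, the likelihood and the calculations that follow can be modified accordingly. Note also that we have assumed that
the uncertainties in the first and second intensity derivatives are negligible. If this is not the case, their effect
can be introduced by substituting $\sigma_n^2$ by the variance of the terms inside the parentheses, which might also
include correlations between $(V_i, Q_i, U_i)$ and $\partial I/\partial\lambda$ or $\partial^2 I/\partial\lambda^2$.
It is important to point out that the likelihood is only a Gaussian distribution if we deal with the observables
$Q(\lambda)$ and $U(\lambda)$, and not with $L(\lambda)$. In the latter case, it would follow a Rayleigh distribution
(strictly, a Rice distribution when the underlying signal is non-zero). Due to the simplicity of the weak-field model,
the exponent of the likelihood can be easily factorized so that quantities related to observables are isolated from
the model parameters. After some algebra, we end up with:

$$p(D|f, B_\parallel, B_\perp, \chi) = (2\pi)^{-3N/2} \sigma_n^{-3N} \exp\left[ -\left( A_1 + A_2 f^2 B_\parallel^2 + A_3 f^2 B_\perp^4 - 2 A_4 f B_\parallel - 2 (A_5 \cos 2\chi + A_6 \sin 2\chi) f B_\perp^2 \right) \right], \tag{8}$$

where the quantities $A_i$ are the only ones that depend on the observations and are given by:

$$\begin{aligned}
A_1 &= (2\sigma_n^2)^{-1} \sum_i \left( V_i^2 + Q_i^2 + U_i^2 \right), &
A_2 &= (2\sigma_n^2)^{-1} \alpha^2 \sum_i \left( \left.\frac{\partial I}{\partial \lambda}\right|_i \right)^2, &
A_3 &= (2\sigma_n^2)^{-1} \beta^2 \sum_i \left( \left.\frac{\partial^2 I}{\partial \lambda^2}\right|_i \right)^2, \\
A_4 &= (2\sigma_n^2)^{-1} \alpha \sum_i V_i \left.\frac{\partial I}{\partial \lambda}\right|_i, &
A_5 &= (2\sigma_n^2)^{-1} \beta \sum_i Q_i \left.\frac{\partial^2 I}{\partial \lambda^2}\right|_i, &
A_6 &= (2\sigma_n^2)^{-1} \beta \sum_i U_i \left.\frac{\partial^2 I}{\partial \lambda^2}\right|_i.
\end{aligned} \tag{9}$$
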
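More intuition can be gained if we complete the squares in the likelihood function, so that:

$$\begin{aligned}
p(D|f, B_\parallel, B_\perp, \chi) = {} & (2\pi)^{-3N/2} \sigma_n^{-3N} \exp\left[ -\left( A_1 - \frac{A_4^2}{A_2} - \frac{(A_5 \cos 2\chi + A_6 \sin 2\chi)^2}{A_3} \right) \right] \\
& \times \exp\left[ -A_2 f^2 \left( B_\parallel - \frac{A_4}{A_2 f} \right)^2 \right] \exp\left[ -A_3 f^2 \left( B_\perp^2 - \frac{A_5 \cos 2\chi + A_6 \sin 2\chi}{A_3 f} \right)^2 \right].
\end{aligned} \tag{10}$$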
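
The factorization of the likelihood exponent can be verified numerically. The sketch below computes the six quantities $A_i$ of Eq. (9) from sampled profiles and checks that the exponent summed point by point as in Eq. (7), the expanded form of Eq. (8) and the completed-squares form of Eq. (10) agree. The data are random toy numbers (not physical profiles) and all function names are illustrative.

```python
import math
import random


def sufficient_statistics(V, Q, U, dI, d2I, alpha, beta, sigma_n):
    """Return (A1, ..., A6) of Eq. (9) from sampled profiles and intensity derivatives."""
    norm = 1.0 / (2.0 * sigma_n**2)
    A1 = norm * sum(v * v + q * q + u * u for v, q, u in zip(V, Q, U))
    A2 = norm * alpha**2 * sum(d * d for d in dI)
    A3 = norm * beta**2 * sum(d * d for d in d2I)
    A4 = norm * alpha * sum(v * d for v, d in zip(V, dI))
    A5 = norm * beta * sum(q * d for q, d in zip(Q, d2I))
    A6 = norm * beta * sum(u * d for u, d in zip(U, d2I))
    return A1, A2, A3, A4, A5, A6


def exponent_direct(V, Q, U, dI, d2I, alpha, beta, sigma_n, f, B_par, B_perp, chi):
    """Minus the exponent of Eq. (7), summed wavelength by wavelength."""
    total = 0.0
    for v, q, u, d1, d2 in zip(V, Q, U, dI, d2I):
        total += (v - alpha * f * B_par * d1) ** 2
        total += (q - beta * f * B_perp**2 * math.cos(2 * chi) * d2) ** 2
        total += (u - beta * f * B_perp**2 * math.sin(2 * chi) * d2) ** 2
    return total / (2.0 * sigma_n**2)


def exponent_expanded(A, f, B_par, B_perp, chi):
    """Minus the exponent of Eq. (8), written in terms of the A_i."""
    A1, A2, A3, A4, A5, A6 = A
    K = A5 * math.cos(2 * chi) + A6 * math.sin(2 * chi)
    return (A1 + A2 * f**2 * B_par**2 + A3 * f**2 * B_perp**4
            - 2 * A4 * f * B_par - 2 * K * f * B_perp**2)


def exponent_completed(A, f, B_par, B_perp, chi):
    """Minus the exponent of Eq. (10), after completing the squares."""
    A1, A2, A3, A4, A5, A6 = A
    K = A5 * math.cos(2 * chi) + A6 * math.sin(2 * chi)
    constant = A1 - A4**2 / A2 - K**2 / A3
    return (constant + A2 * f**2 * (B_par - A4 / (A2 * f)) ** 2
            + A3 * f**2 * (B_perp**2 - K / (A3 * f)) ** 2)


# Toy data: arbitrary numbers of order unity, only meant to test the algebra.
rng = random.Random(0)
n_points = 12
dI = [rng.uniform(-1, 1) for _ in range(n_points)]
d2I = [rng.uniform(-1, 1) for _ in range(n_points)]
V = [rng.uniform(-1, 1) for _ in range(n_points)]
Q = [rng.uniform(-1, 1) for _ in range(n_points)]
U = [rng.uniform(-1, 1) for _ in range(n_points)]
alpha, beta, sigma_n = -2.0, -0.5, 0.3
params = (0.4, 1.5, 0.8, 0.7)  # (f, B_par, B_perp, chi)

A = sufficient_statistics(V, Q, U, dI, d2I, alpha, beta, sigma_n)
e_direct = exponent_direct(V, Q, U, dI, d2I, alpha, beta, sigma_n, *params)
e_expanded = exponent_expanded(A, *params)
e_completed = exponent_completed(A, *params)
```

Because the $A_i$ are computed once per pixel, evaluating the likelihood for any trial set of parameters afterwards costs only a handful of operations, independently of the number of wavelength points.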



3.2. Priors 

One of the advantages of the Bayesian approach is that all a priori information about the model parameters is made
explicit in the formalism. This is done through the probability distribution $p(f, B_\parallel, B_\perp, \chi)$. For
simplicity, we assume that the prior distribution factorizes, so that:

$$p(f, B_\parallel, B_\perp, \chi) = p(f)\, p(B_\parallel)\, p(B_\perp)\, p(\chi). \tag{11}$$

Correlations between model parameters will then be obtained from the observed data through the likelihood. We do
not have any preference for any specific value of the filling factor or the field azimuth, so we set flat priors in
the intervals $[0, 1]$ and $[0, 2\pi]$, respectively. As a consequence, $p(f) = \Pi(f - 1/2)$ and
$p(\chi) = (2\pi)^{-1} \Pi((2\pi)^{-1}(\chi - \pi))$, where $\Pi(x)$ is the standard rectangular function. If one wants
to carry out the inference assuming $f = 1$, it is enough to set $p(f) = \delta(f - 1)$ and the following formulae are
still valid. Concerning the components of the magnetic field, it is interesting to use priors that give low preference
to very large values. We know that values above a few thousand G are not realistic. Instead of using a truncated flat
prior, for computational purposes and inspired by physical considerations, we use different functional forms, trying
to be as non-informative as possible while maintaining physical constraints. For $B_\parallel$ we propose a Gaussian
distribution with a large variance $\sigma_\parallel^2$, while for $B_\perp$ we propose a Rayleigh distribution with a
large scale parameter $\sigma_\perp$, a consequence of assuming that each of the two components of the magnetic field
perpendicular to the line-of-sight (LOS) is Gaussian with variance $\sigma_\perp^2$. If
$\sigma_\parallel = \sigma_\perp$, we end up with an isotropic distribution for the magnetic field vector. The two
parameters $\sigma_\parallel$ and $\sigma_\perp$ are known as hyperparameters because they parametrize the priors. The
complete prior distribution is given by:

$$p(f, B_\parallel, B_\perp, \chi) = \Pi\left( f - \frac{1}{2} \right) \frac{1}{2\pi} \Pi\left( \frac{\chi - \pi}{2\pi} \right) \frac{1}{\sqrt{2\pi}\, \sigma_\parallel} \exp\left( -\frac{B_\parallel^2}{2\sigma_\parallel^2} \right) \frac{B_\perp}{\sigma_\perp^2} \exp\left( -\frac{B_\perp^2}{2\sigma_\perp^2} \right). \tag{12}$$



We note that any other functional form for the prior can be used if it is based on sensible physical intuition,
although some of the following integrals may require more numerical work for their evaluation. As a rule, if the data
contain sufficient information about the model parameters, the results should be almost insensitive to the choice of
the prior as long as the prior gives non-negligible probability to the regions of high likelihood. For this reason, it
is fundamental to compare the posterior with the chosen prior to know whether the data have added information to the
inference problem. If they are very similar, the results depend on the specific choice of the prior and should, in
principle, be discarded.



3.3. Posterior 

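The full posterior distribution results from the application of Eq. (6):

$$\begin{aligned}
p(f, B_\parallel, B_\perp, \chi|D) = {} & \Pi\left( f - \frac{1}{2} \right) \Pi\left( \frac{\chi - \pi}{2\pi} \right) (2\pi)^{-(3N+3)/2} \sigma_n^{-3N} \sigma_\parallel^{-1} \sigma_\perp^{-2} B_\perp \\
& \times \exp\left[ -\left( A_1 + A_2 f^2 B_\parallel^2 + A_3 f^2 B_\perp^4 - 2 A_4 f B_\parallel - 2 f (A_5 \cos 2\chi + A_6 \sin 2\chi) B_\perp^2 + \frac{B_\parallel^2}{2\sigma_\parallel^2} + \frac{B_\perp^2}{2\sigma_\perp^2} \right) \right].
\end{aligned} \tag{13}$$

Since in the non-hierarchical model that we are dealing with in this section the normalizing constants are just a
scale, they can be dropped without effect. They will, however, be important in the following section when
marginalizing out the prior parameters. The following rewriting is also convenient for evaluating some of the marginal
posteriors for the components of the magnetic field vector:

$$p(f, B_\parallel, B_\perp, \chi|D) = \Pi\left( f - \frac{1}{2} \right) \Pi\left( \frac{\chi - \pi}{2\pi} \right) (2\pi)^{-(3N+3)/2} \sigma_n^{-3N} \sigma_\parallel^{-1} \sigma_\perp^{-2} B_\perp \exp\left[ -\left( \mathcal{B}_1 - 2 \mathcal{B}_2 f + \mathcal{B}_3 f^2 \right) \right], \tag{14}$$

where

$$\mathcal{B}_1 = A_1 + \frac{B_\parallel^2}{2\sigma_\parallel^2} + \frac{B_\perp^2}{2\sigma_\perp^2}, \qquad
\mathcal{B}_2 = A_4 B_\parallel + (A_5 \cos 2\chi + A_6 \sin 2\chi) B_\perp^2, \qquad
\mathcal{B}_3 = A_2 B_\parallel^2 + A_3 B_\perp^4. \tag{15}$$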



The joint posterior distribution contains all information about the parameters of the weak-field approximation for a 
set of observed Stokes profiles. Different versions of the posterior can be built if different priors are assigned. 
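
3.4. Maximum a posteriori and maximum-likelihood solutions

The maximum a posteriori (MAP) solution to the problem (the one producing the largest posterior inside the prior
volume) can be obtained by finding the value of $(f, B_\parallel, B_\perp, \chi)_\mathrm{MAP}$ that maximizes Eq. (13),
which is equivalent to minimizing $-\ln p(f, B_\parallel, B_\perp, \chi|D)$. The MAP solution is, therefore, the one
minimizing:

$$-\ln p(f, B_\parallel, B_\perp, \chi|D) = -\ln B_\perp + A_1 + A_2 f^2 B_\parallel^2 + A_3 f^2 B_\perp^4 - 2 A_4 f B_\parallel - 2 f (A_5 \cos 2\chi + A_6 \sin 2\chi) B_\perp^2 + \frac{B_\parallel^2}{2\sigma_\parallel^2} + \frac{B_\perp^2}{2\sigma_\perp^2} + \mathrm{const}. \tag{16}$$

The solution can be found by solving the following non-linear system of equations, obtained by setting to zero the
derivatives of Eq. (16) with respect to $f$, $B_\parallel$, $B_\perp$ and $\chi$:

$$\begin{aligned}
-A_4 B_\parallel + A_2 B_\parallel^2 f + A_3 B_\perp^4 f - (A_5 \cos 2\chi + A_6 \sin 2\chi) B_\perp^2 &= 0, \\
-A_4 f + B_\parallel \left( A_2 f^2 + \frac{1}{2\sigma_\parallel^2} \right) &= 0, \\
-\frac{1}{2 B_\perp} + 2 A_3 B_\perp^3 f^2 - B_\perp \left[ 2 (A_5 \cos 2\chi + A_6 \sin 2\chi) f - \frac{1}{2\sigma_\perp^2} \right] &= 0, \\
4 f B_\perp^2 A_5 \sin 2\chi - 4 f B_\perp^2 A_6 \cos 2\chi &= 0.
\end{aligned} \tag{17}$$

The last equation is the standard estimation of the field azimuth which, provided that $B_\perp \neq 0$ and
$f \neq 0$, results in:

$$\tan 2\chi = \frac{A_6}{A_5}, \tag{18}$$

which can then be understood as the maximum a posteriori estimation of the field azimuth. Additionally,
particularizing to the case of a longitudinal magnetograph (i.e., if we do not measure linear polarization) and a very
broad prior, we find the obvious solution $f B_\parallel = A_4/A_2$, which has already been found by
Martínez González & Bellot Rubio (2009). Note also that the standard maximum-likelihood solution (also known as the
least-squares solution) is obtained following the same scheme after setting $\sigma_\parallel \to \infty$ and
$\sigma_\perp \to \infty$ and dropping the term $\ln B_\perp$ from Eq. (16) (equivalently, the term $-1/(2B_\perp)$
from Eqs. (17)).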
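
Two of the MAP conditions can be solved in closed form and are easy to sketch. The example below, with illustrative function names and toy noise-free data, recovers the azimuth from Eq. (18) and the longitudinal field from the second condition of Eqs. (17) for a fixed filling factor. Using `atan2` instead of a plain arctangent is a choice made here so that the selected branch has $A_5 \cos 2\chi + A_6 \sin 2\chi > 0$, which is the one that decreases Eq. (16).

```python
import math


def map_azimuth(Q, U, d2I, beta):
    """Azimuth from Eq. (18); the common factor (2 sigma_n^2)^-1 of A5 and A6 cancels."""
    A5 = beta * sum(q * d for q, d in zip(Q, d2I))
    A6 = beta * sum(u * d for u, d in zip(U, d2I))
    # atan2 picks the branch with A5*cos(2chi) + A6*sin(2chi) > 0; result folded to [0, pi).
    return (0.5 * math.atan2(A6, A5)) % math.pi


def map_longitudinal_field(V, dI, alpha, sigma_n, f, sigma_par):
    """B_par solving the second condition of Eqs. (17) for a given filling factor f."""
    norm = 1.0 / (2.0 * sigma_n**2)
    A2 = norm * alpha**2 * sum(d * d for d in dI)
    A4 = norm * alpha * sum(v * d for v, d in zip(V, dI))
    return A4 * f / (A2 * f**2 + 1.0 / (2.0 * sigma_par**2))


# Toy, noise-free data built from Eqs. (3); all numbers are illustrative.
alpha, beta = -3.9e-5, -3.7e-10
f_true, B_par_true, B_perp_true, chi_true = 0.5, -200.0, 300.0, 2.0
dI = [-6.0, -3.0, 0.0, 3.0, 6.0, 2.0]
d2I = [150.0, -40.0, -300.0, -40.0, 150.0, 80.0]
V = [alpha * f_true * B_par_true * d for d in dI]
Q = [beta * f_true * B_perp_true**2 * math.cos(2 * chi_true) * d for d in d2I]
U = [beta * f_true * B_perp_true**2 * math.sin(2 * chi_true) * d for d in d2I]

chi_map = map_azimuth(Q, U, d2I, beta)
# A very broad prior approaches the maximum-likelihood value A4 / (A2 f).
B_par_broad = map_longitudinal_field(V, dI, alpha, 1e-3, f_true, sigma_par=1e6)
# A narrower prior shrinks the estimate towards zero.
B_par_narrow = map_longitudinal_field(V, dI, alpha, 1e-3, f_true, sigma_par=100.0)
```

The shrinkage produced by the narrower prior is small when the data are informative, which illustrates why the posterior should be compared with the prior before trusting the result.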

3.5. Marginal posteriors 

To fully take into account the presence of degeneracies among the model parameters in the Bayesian approach, we
have to compute the marginal posterior distribution for every parameter individually. The marginal posterior is
obtained by integrating out all parameters but the one of interest. For instance, the posterior for $B_\parallel$ is
computed as:

$$p(B_\parallel|D) = \int_0^{2\pi} d\chi \int_0^1 df \int_0^\infty dB_\perp \, p(f, B_\parallel, B_\perp, \chi|D). \tag{19}$$

This integration introduces into the probability distribution of a given parameter all possible values of the rest of
the model parameters, weighted by their associated probabilities. The advantage of the weak-field model is that some of
these integrals can be obtained analytically in closed form. Due to the complexity of the integrals, we have been
unable to find closed analytical expressions for the one-dimensional marginal distributions. However, we present in
App. B the expressions for the 2-dimensional marginal posterior $p(f, \chi|D)$ and for the 3-dimensional marginal
posteriors $p(f, B_\parallel, B_\perp|D)$, $p(B_\parallel, B_\perp, \chi|D)$, $p(f, B_\perp, \chi|D)$ and
$p(f, B_\parallel, \chi|D)$.

The shape of the marginal posterior is fundamental to decide whether a parameter is constrained or not by the
observations. Although the full solution to the problem is the marginal posterior itself, it is sometimes interesting
for presentation purposes to give a summary of the distribution. If there is a clear peak with tails falling to zero
(like a Gaussian, although it can often be asymmetric), the value of the parameter at the peak (the most probable) is
the so-called marginal maximum a posteriori (MMAP) solution or mode. A confidence interval can be obtained by
integrating the posterior until a given fraction of the total area is reached. Another possibility to summarize the
distribution is to give the median value (where the cumulative probability distribution equals 1/2), together with a
confidence region. Finally, it is possible to compute moments of the quantity $x$ using the standard definition:

$$\langle x^n \rangle = \frac{\int x^n p(x|D)\, dx}{\int p(x|D)\, dx}, \tag{20}$$

and the confidence interval is obtained as in the MMAP case.
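
The summaries described above are straightforward to compute once a one-dimensional marginal posterior has been tabulated on a grid. The following sketch is one possible implementation, not the procedure used in the paper: it assumes an equal-tailed interval (other definitions, such as the highest-density region, are equally valid) and uses the trapezoidal rule for the integrals. The Gaussian test posterior is purely illustrative.

```python
import math


def summarize_posterior(grid, density, level=0.68):
    """Mode, median, mean and an equal-tailed interval of a 1D posterior on a grid."""
    # Cumulative distribution by the trapezoidal rule; the density need not be normalized.
    cdf = [0.0]
    first_moment = 0.0
    for i in range(1, len(grid)):
        dx = grid[i] - grid[i - 1]
        cdf.append(cdf[-1] + 0.5 * (density[i] + density[i - 1]) * dx)
        first_moment += 0.5 * (grid[i] * density[i] + grid[i - 1] * density[i - 1]) * dx
    total = cdf[-1]
    cdf = [c / total for c in cdf]

    def quantile(p):
        # Linear interpolation of the inverse cumulative distribution.
        for i in range(1, len(cdf)):
            if cdf[i] >= p:
                step = cdf[i] - cdf[i - 1]
                t = (p - cdf[i - 1]) / step if step > 0 else 0.0
                return grid[i - 1] + t * (grid[i] - grid[i - 1])
        return grid[-1]

    i_mode = max(range(len(grid)), key=lambda i: density[i])
    tail = 0.5 * (1.0 - level)
    return {
        "mode": grid[i_mode],                 # MMAP value
        "median": quantile(0.5),
        "mean": first_moment / total,         # Eq. (20) with n = 1
        "interval": (quantile(tail), quantile(1.0 - tail)),
    }


# Illustrative posterior: a Gaussian centered at 50 G with a standard deviation of 10 G.
grid = [-50.0 + 0.1 * i for i in range(2001)]
density = [math.exp(-0.5 * ((x - 50.0) / 10.0) ** 2) for x in grid]
summary = summarize_posterior(grid, density)
```

For a posterior that piles up against a boundary of the prior (an upper limit, for instance), the mode and the equal-tailed interval can be misleading, and quoting a one-sided percentile is usually more informative.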



3.6. Marginal posterior of derived quantities 

Since all the information about the model parameters is contained in the posterior probability distribution function,
information about any derived quantity can be computed from it using the machinery presented in App. A.

3.6.1. Magnetic flux density

One of the quantities of interest is the magnetic flux density, defined as $F_\parallel = f B_\parallel$. Following the
change of variables formulae, we can write the probability distribution function of the magnetic flux density as:

$$\begin{aligned}
p(F_\parallel > 0|D) &= \int_{F_\parallel}^{\infty} p\left( f = \frac{F_\parallel}{B_\parallel}, B_\parallel \,\Big|\, D \right) \frac{1}{|B_\parallel|}\, dB_\parallel, \\
p(F_\parallel < 0|D) &= \int_{-\infty}^{F_\parallel} p\left( f = \frac{F_\parallel}{B_\parallel}, B_\parallel \,\Big|\, D \right) \frac{1}{|B_\parallel|}\, dB_\parallel,
\end{aligned} \tag{21}$$

where the joint distribution $p(f, B_\parallel|D)$ is obtained by marginalizing $B_\perp$ and $\chi$ from the full
posterior. Since $0 < f < 1$, $|B_\parallel| > |F_\parallel|$ has to be fulfilled, which splits the integral into two
different cases depending on the sign of $F_\parallel$. The integration over $B_\parallel$ has to be carried out
numerically since it is not possible to obtain a closed expression.
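The previous approach is unnecessarily complex for such a simple quantity as the magnetic flux density. In this
special case, it can be computed following a different path, noting that the posterior for $F_\parallel$ can also be
obtained directly because:

$$V(\lambda) = \alpha F_\parallel \frac{\partial I(\lambda)}{\partial \lambda}, \tag{22}$$

and neither $Q(\lambda)$ nor $U(\lambda)$ depends on $F_\parallel$. In such a case, it is easy to write a posterior for
this variable:

$$p(F_\parallel|D) = p(F_\parallel)\, (2\pi)^{-N/2} \sigma_n^{-N} \exp\left\{ -\frac{1}{2\sigma_n^2} \sum_{i=1}^{N} \left[ V_i - \alpha F_\parallel \left.\frac{\partial I}{\partial \lambda}\right|_i \right]^2 \right\}. \tag{23}$$

The only remaining ingredient is the prior distribution $p(F_\parallel)$. Since $F_\parallel = f B_\parallel$ and the
priors for $f$ and $B_\parallel$ have been discussed in Sec. 3.2, the prior for $F_\parallel$ can be obtained following
App. A, thus resulting in:

$$p(F_\parallel) = \frac{1}{2\sqrt{2\pi}\, \sigma_\parallel} E_1\left( \frac{F_\parallel^2}{2\sigma_\parallel^2} \right), \tag{24}$$

where $E_1(x)$ is the first exponential integral (e.g., Abramowitz & Stegun 1972). When the data are sufficiently
informative so that the prior distribution becomes unimportant, the posterior for the magnetic flux density is
Gaussian with mean and variance given by (Martínez González & Bellot Rubio 2009):

$$\langle F_\parallel \rangle = \frac{A_4}{A_2} = \frac{\sum_i V_i \left.\frac{\partial I}{\partial \lambda}\right|_i}{\alpha \sum_i \left( \left.\frac{\partial I}{\partial \lambda}\right|_i \right)^2}, \qquad
\sigma^2(F_\parallel) = \frac{1}{2 A_2} = \frac{\sigma_n^2}{\alpha^2 \sum_i \left( \left.\frac{\partial I}{\partial \lambda}\right|_i \right)^2}. \tag{25}$$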

3.6.2. Magnetic field inclination 



A similar approach can be followed to compute the marginal posterior for the field inclination from the joint
distribution of $B_\parallel$ and $B_\perp$. To this end, we apply again the rules of App. A to the change of variables
$\theta = \arctan(B_\perp/B_\parallel)$ and obtain:

$$\begin{aligned}
p(\theta > 0|D) &= \left| 1 + \tan^2\theta \right| \int_0^{\infty} dB_\parallel \, p(B_\parallel, B_\perp = B_\parallel \tan\theta|D)\, |B_\parallel|, \\
p(\theta < 0|D) &= \left| 1 + \tan^2\theta \right| \int_{-\infty}^{0} dB_\parallel \, p(B_\parallel, B_\perp = B_\parallel \tan\theta|D)\, |B_\parallel|,
\end{aligned} \tag{26}$$

where the joint distribution is computed by marginalizing $f$ and $\chi$ from the posterior and the limits of
integration have been adapted to fulfill $B_\perp = B_\parallel \tan\theta > 0$.

Using the same procedure, the prior distribution for the inclination can be obtained by plugging Eq. (12) into Eq.
(26). The result is, after some algebra, that $p(\theta) \propto |\sin\theta|$, an obvious result because the prior we
are using (with $\sigma_\parallel = \sigma_\perp$) assumes that the field is isotropically distributed.
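
The isotropy implied by the prior can be checked with a small Monte Carlo experiment. The sketch below draws field vectors from a Gaussian prior for $B_\parallel$ and a Rayleigh prior for $B_\perp$ with equal scale parameters. For convenience it measures the inclination with `atan2`, so that it lies in $[0, \pi]$ rather than in the $(-\pi/2, \pi/2)$ range of the arctangent used in the text; with this convention the isotropic density is proportional to $\sin\theta$ and $\cos\theta$ is uniform in $[-1, 1]$. The sample size, seed and scale are arbitrary.

```python
import math
import random


def sample_prior_inclinations(n_samples, sigma=1000.0, seed=1):
    """Draw inclinations (in [0, pi]) from the prior with sigma_par = sigma_perp = sigma."""
    rng = random.Random(seed)
    thetas = []
    for _ in range(n_samples):
        b_par = rng.gauss(0.0, sigma)
        # The modulus of two independent Gaussian components is Rayleigh distributed.
        b_perp = math.hypot(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))
        thetas.append(math.atan2(b_perp, b_par))
    return thetas


thetas = sample_prior_inclinations(20000)
# For an isotropic field, cos(theta) is uniform, so half of the samples should lie
# within 30 degrees of the plane perpendicular to the LOS (|cos(theta)| <= 0.5).
fraction_equatorial = sum(1 for t in thetas if math.pi / 3 <= t <= 2 * math.pi / 3) / len(thetas)
mean_abs_cos = sum(abs(math.cos(t)) for t in thetas) / len(thetas)
```

Repeating the experiment with different values for the two scale parameters shows how the prior then favors either longitudinal or transverse fields, which is worth keeping in mind when interpreting inclinations inferred from noisy data.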



3.6.3. Magnetic field strength and energy 
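Exactly the same procedure can be applied to obtain the posterior for the magnetic field strength,
$B = \sqrt{B_\parallel^2 + B_\perp^2}$, and for the magnetic field energy, $E_\mathrm{mag} = B^2/8\pi$, obtaining:

$$\begin{aligned}
p(B|D) &= \int_{-B}^{B} dB_\parallel \, \frac{B}{\sqrt{B^2 - B_\parallel^2}}\; p\left( B_\parallel, B_\perp = \sqrt{B^2 - B_\parallel^2} \,\Big|\, D \right), \\
p(E_\mathrm{mag}|D) &= \int_{-\sqrt{8\pi E_\mathrm{mag}}}^{\sqrt{8\pi E_\mathrm{mag}}} dB_\parallel \, \frac{4\pi}{\sqrt{8\pi E_\mathrm{mag} - B_\parallel^2}}\; p\left( B_\parallel, B_\perp = \sqrt{8\pi E_\mathrm{mag} - B_\parallel^2} \,\Big|\, D \right),
\end{aligned} \tag{27}$$

where the joint probability distribution is computed by marginalizing $f$ and $\chi$ from the posterior.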

4. HIERARCHICAL BAYESIAN APPROACH 

In principle, the values of $\sigma_\parallel$ and $\sigma_\perp$ in the previous formalism are understood to be fixed
and given a priori. One of the disadvantages of this approach is that, if there is not sufficient information about the
model parameters encoded in the observations, the results will surely be sensitive to the choice of these numbers (see
§5 for an example). However, we can take advantage of the fact that, in the Bayesian formalism, every unknown can be
considered a random variable. We can put an appropriate prior on them, extend the formalism in a hierarchical way and
let the data determine their values. Since the priors are put over hyperparameters (parameters of the prior
distributions), they are known as hyperpriors. In principle, nothing prevents us from making hyperpriors depend on
additional hyperparameters over which appropriate priors are assigned and so continuing the hierarchical structure
(see, e.g., Gregory 2005; Gelman et al. 2003 for more details on hierarchical models). The immediate effect of the
hierarchical approach is that results are much less dependent on the specific choice of hyperparameters and the
problem becomes essentially free of parameters. In some sense, this happens because we allow the priors to adapt to
the data. It can be demonstrated that some standard regularization schemes applied to least-squares techniques are
indeed particular cases of hierarchical models.

4.1. Known noise variance 

Using basic probability calculus, the full posterior can be expressed by marginalizing out the hyperparameters,
thus:

$$p(f, B_\parallel, B_\perp, \chi|D) = \int d\sigma_\parallel\, d\sigma_\perp \, p(D|f, B_\parallel, B_\perp, \chi)\, p(f, B_\parallel, B_\perp, \chi|\sigma_\parallel, \sigma_\perp)\, p(\sigma_\parallel, \sigma_\perp), \tag{28}$$

where we have used the fact that the likelihood does not depend on the parameters $\sigma_\parallel$ and
$\sigma_\perp$. Note that this marginalization takes into account all possible values of the hyperparameters weighted
by their probability. The term $p(f, B_\parallel, B_\perp, \chi|\sigma_\parallel, \sigma_\perp)$ is the prior defined in
Eq. (12) and the new distribution $p(\sigma_\parallel, \sigma_\perp)$ gives prior information about the
hyperparameters. Since they behave as scale parameters (they can take values spanning several decades), a good option
is to use a uniform prior on a logarithmic scale, so that:

$$p(\sigma_\parallel, \sigma_\perp) = \frac{1}{\sigma_\parallel} \frac{1}{\sigma_\perp}. \tag{29}$$

This prior, also known as Jeffreys' prior (Jeffreys 1961; MacKay 2003), has the remarkable property of being the only
non-informative prior for scale-invariant quantities. This complicates the analytical solution of the problem,
requiring more purely numerical integrations, but also introduces additional stabilization. Although the Jeffreys'
prior is improper (its integral is not finite), the integrals of the prior over $\sigma_\parallel$ and $\sigma_\perp$
are finite, although the resulting marginal prior is also improper. The integral of Eq. (28) can finally be carried out
over all possible values of $\sigma_\parallel$ and $\sigma_\perp$ to give:



$$p(f, B_\parallel, B_\perp, \chi|D) \propto p(D|f, B_\parallel, B_\perp, \chi)\, \Pi\left( f - \frac{1}{2} \right) \frac{1}{4\pi} \Pi\left( \frac{\chi - \pi}{2\pi} \right) |B_\parallel|^{-1} B_\perp^{-1}, \tag{30}$$

where the likelihood $p(D|f, B_\parallel, B_\perp, \chi)$ is given by Eq. (8). The marginalization process demonstrates
that the Jeffreys' prior is a natural choice for $B_\parallel$ and $B_\perp$. In other words, we could have started our
calculation by selecting a Jeffreys' prior in Eq. (12) without considering the hierarchical approach.
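
The way in which the hierarchical marginalization turns the Gaussian and Rayleigh priors into Jeffreys-like priors for the field components can be checked numerically. The sketch below integrates each prior over its scale parameter with the measure $d\sigma/\sigma$ of Eq. (29), working in the variable $u = \ln\sigma$, and compares the results with $1/(2|B_\parallel|)$ and $1/B_\perp$. The integration limits, number of steps and test field values are arbitrary choices.

```python
import math


def integrate_log_sigma(integrand, center, n_steps=4000, below=5.0, above=35.0):
    """Midpoint rule for the integral of integrand(sigma) d(sigma)/sigma.

    The change of variable u = ln(sigma) turns the Jeffreys measure into du.
    The range spans from center*exp(-below) to center*exp(above).
    """
    u_min = math.log(center) - below
    h = (below + above) / n_steps
    total = 0.0
    for k in range(n_steps):
        sigma = math.exp(u_min + (k + 0.5) * h)
        total += integrand(sigma)
    return total * h


def gaussian_prior(B, sigma):
    """Gaussian prior for B_par used in Eq. (12)."""
    return math.exp(-0.5 * (B / sigma) ** 2) / (math.sqrt(2.0 * math.pi) * sigma)


def rayleigh_prior(B, sigma):
    """Rayleigh prior for B_perp used in Eq. (12)."""
    return B / sigma**2 * math.exp(-0.5 * (B / sigma) ** 2)


B_par, B_perp = -250.0, 40.0
marginal_par = integrate_log_sigma(lambda s: gaussian_prior(B_par, s), abs(B_par))
marginal_perp = integrate_log_sigma(lambda s: rayleigh_prior(B_perp, s), B_perp)
# Expected closed forms: 1 / (2 |B_par|) and 1 / B_perp, whose product gives the
# field dependence |B_par|^-1 B_perp^-1 of Eq. (30).
ratio_par = marginal_par * 2.0 * abs(B_par)
ratio_perp = marginal_perp * B_perp
```

Both ratios should be very close to one, and the extra factor 1/2 combines with the $1/2\pi$ of the azimuth prior to give the $1/4\pi$ that appears in Eq. (30).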

In case no information is present in the data (flat likelihood), the prior for $F_\parallel$ can be obtained by
substitution of the previous expression in Eq. (21):

$$p(F_\parallel) \propto \frac{1}{|F_\parallel|}, \tag{31}$$

so the posterior is given by Eq. (25).

It is also possible to introduce new free hyperparameters $\sigma_\mathrm{min}$ and $\sigma_\mathrm{max}$ that define a
lower and an upper limit for the width of the prior distributions, respectively. Although they constitute a new set of
free parameters, the hierarchical character offers the advantage that results are much less sensitive to their exact
values. Because we do not expect magnetic fields in the solar atmosphere larger than ~4000 G and we do not expect to
detect fields below ~0.1 G, it is sensible to choose $\sigma_\mathrm{min} = 0.1$ G and $\sigma_\mathrm{max} = 4000$ G.
These values can be adapted for specific cases (inverting Stokes profiles observed in active regions or the quiet Sun)
but the results are largely insensitive to the specific values, as shown in §5. The following prior is the one that
results after marginalizing out $\sigma_\parallel$ and $\sigma_\perp$:



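$$p(f, B_\parallel, B_\perp, \chi) \propto \Pi\left( f - \frac{1}{2} \right) \frac{1}{4\pi} \Pi\left( \frac{\chi - \pi}{2\pi} \right) \frac{1}{B_\parallel B_\perp} \left[ \mathrm{erf}\left( \frac{B_\parallel}{\sqrt{2}\, \sigma_\mathrm{min}} \right) - \mathrm{erf}\left( \frac{B_\parallel}{\sqrt{2}\, \sigma_\mathrm{max}} \right) \right] \left[ \exp\left( -\frac{B_\perp^2}{2\sigma_\mathrm{max}^2} \right) - \exp\left( -\frac{B_\perp^2}{2\sigma_\mathrm{min}^2} \right) \right], \tag{32}$$

which converges to the prior of Eq. (30) when $\sigma_\mathrm{min} = 0$ and $\sigma_\mathrm{max} \to \infty$. The
function $\mathrm{erf}(x)$ is the error function (e.g., Abramowitz & Stegun 1972). This yields the following general
expression for the posterior that we use when the noise variance is known:

$$\begin{aligned}
p(f, B_\parallel, B_\perp, \chi|D) \propto {} & \Pi\left( f - \frac{1}{2} \right) \Pi\left( \frac{\chi - \pi}{2\pi} \right) (2\pi)^{-(3N+1)/2} \sigma_n^{-3N} \\
& \times \exp\left[ -\left( A_1 + A_2 f^2 B_\parallel^2 + A_3 f^2 B_\perp^4 - 2 A_4 f B_\parallel - 2 (A_5 \cos 2\chi + A_6 \sin 2\chi) f B_\perp^2 \right) \right] \\
& \times \frac{1}{B_\parallel B_\perp} \left[ \mathrm{erf}\left( \frac{B_\parallel}{\sqrt{2}\, \sigma_\mathrm{min}} \right) - \mathrm{erf}\left( \frac{B_\parallel}{\sqrt{2}\, \sigma_\mathrm{max}} \right) \right] \left[ \exp\left( -\frac{B_\perp^2}{2\sigma_\mathrm{max}^2} \right) - \exp\left( -\frac{B_\perp^2}{2\sigma_\mathrm{min}^2} \right) \right].
\end{aligned} \tag{33}$$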
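
The field-dependent part of the truncated prior is cheap to evaluate with the standard error function. The following sketch (with an illustrative function name and an arbitrary normalization) implements the factor of Eq. (32) that depends on $B_\parallel$ and $B_\perp$, and compares it with the Jeffreys limit $|B_\parallel|^{-1} B_\perp^{-1}$ of Eq. (30). It is not defined at exactly $B_\parallel = 0$ or $B_\perp = 0$, where the limit would have to be taken analytically.

```python
import math


def truncated_field_prior(B_par, B_perp, sigma_min=0.1, sigma_max=4000.0):
    """Field-dependent factor of Eq. (32), up to a multiplicative constant."""
    root2 = math.sqrt(2.0)
    erf_term = math.erf(B_par / (root2 * sigma_min)) - math.erf(B_par / (root2 * sigma_max))
    exp_term = (math.exp(-B_perp**2 / (2.0 * sigma_max**2))
                - math.exp(-B_perp**2 / (2.0 * sigma_min**2)))
    # erf is odd, so erf_term / B_par is even in B_par and the prior stays positive.
    return erf_term * exp_term / (B_par * B_perp)


B_par, B_perp = 100.0, 200.0
jeffreys = 1.0 / (abs(B_par) * B_perp)
# Limits quoted in the text (0.1 G and 4000 G): mild departure from the Jeffreys prior.
ratio_default = truncated_field_prior(B_par, B_perp) / jeffreys
# Very wide limits: the Jeffreys prior is recovered.
ratio_wide = truncated_field_prior(B_par, B_perp, sigma_min=1e-3, sigma_max=1e9) / jeffreys
```

The truncation mainly suppresses fields comparable to or larger than $\sigma_\mathrm{max}$ and removes the divergence of the Jeffreys prior at zero field, which is what makes the resulting prior proper.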



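Again, we have been unable to obtain the 1-dimensional marginal distributions, but App. B presents the 3-dimensional
marginal posteriors $p(f, B_\parallel, B_\perp|D)$ and $p(B_\parallel, B_\perp, \chi|D)$. In this generalized case,
under non-informative data, the prior for $F_\parallel$ results in:

$$p(F_\parallel) \propto \frac{1}{2 F_\parallel} \left[ \mathrm{erf}\left( \frac{F_\parallel}{\sqrt{2}\, \sigma_\mathrm{min}} \right) - \mathrm{erf}\left( \frac{F_\parallel}{\sqrt{2}\, \sigma_\mathrm{max}} \right) \right] + \frac{1}{2\sqrt{2\pi}\, \sigma_\mathrm{min}} E_1\left( \frac{F_\parallel^2}{2\sigma_\mathrm{min}^2} \right) - \frac{1}{2\sqrt{2\pi}\, \sigma_\mathrm{max}} E_1\left( \frac{F_\parallel^2}{2\sigma_\mathrm{max}^2} \right), \tag{34}$$

which reduces to Eq. (31) when $\sigma_\mathrm{min} = 0$ and $\sigma_\mathrm{max} \to \infty$. Substitution in Eq. (23)
gives the posterior for $F_\parallel$.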



4.2. Unknown noise variance 

Another step forward in the hierarchical model can be taken if the noise standard deviation is not known with
precision. Sometimes the peculiarities of the observation make estimating the observational uncertainty a difficult
task. It is typically estimated assuming ergodicity and calculating the variance of the signal in a continuum window,
where the polarimetric signal is assumed to be zero. This might be flawed if, for instance, the polarimetric signal is
very broad and the assumption of zero signal in the continuum window is not correct. Likewise, when there is a large
velocity field producing a large Doppler shift, neighboring spectral lines can enter into the continuum window and
give a wrong estimation of the observational uncertainty. This is especially relevant when observing with
filter-polarimeters, where the information about the continuum is usually contained in one or two spectral samples. In
such a case, it is possible to consider $\sigma_n$ as a random variable that is marginalized at the end. Although we
assume that the variance of the observational uncertainty is unknown, we postulate that the noise follows a Gaussian
distribution. This distribution, in the absence of detailed information about the correct distribution apart from the
finite variance, is the one suggested by the principle of maximum entropy. Therefore:



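$$
p(f,B_\parallel,B_\perp,\chi|D) = \int d\sigma_\parallel\, d\sigma_\perp\, d\sigma_n\; p(D|f,B_\parallel,B_\perp,\chi,\sigma_n)\, p(f,B_\parallel,B_\perp,\chi|\sigma_\parallel,\sigma_\perp)\, p(\sigma_\parallel,\sigma_\perp)\, p(\sigma_n).
\tag{35}
$$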



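Using a Jeffreys' prior over $\sigma_n$, so that $p(\sigma_n) = \sigma_n^{-1}$, and integrating over the full domain in $\sigma_n$, $\sigma_\parallel$ and $\sigma_\perp$, we obtain the
following posterior:

$$
p(f,B_\parallel,B_\perp,\chi|D) \propto \Pi\!\left(f-\tfrac{1}{2}\right)\Pi\!\left(\frac{\chi-\pi}{2\pi}\right)\frac{1}{|B_\parallel|B_\perp}\,(2\pi)^{-(3N+1)/2}\,\Gamma\!\left(\frac{3N}{2}\right)C^{-3N/2},
\tag{36}
$$

where

$$
C = A'_1+A'_2f^2B_\parallel^2+A'_3f^2B_\perp^4-2A'_4fB_\parallel-2\left(A'_5\cos 2\chi+A'_6\sin 2\chi\right)fB_\perp^2,
\tag{37}
$$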



the $A'_i$ are obtained from the $A_i$ defined in Eq. (9) but dropping the terms $(2\sigma_n^2)^{-1}$, and $N$ is the number of wavelength
points in the profiles. If we limit the integration to the range $[\sigma_{n,\mathrm{min}}, \sigma_{n,\mathrm{max}}]$ because we know that the noise cannot be too
small or too large, but keep the integration on the full space for $\sigma_\parallel$ and $\sigma_\perp$, we end up with the following posterior:



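$$
p(f,B_\parallel,B_\perp,\chi|D) \propto \Pi\!\left(f-\tfrac{1}{2}\right)\Pi\!\left(\frac{\chi-\pi}{2\pi}\right)\frac{1}{|B_\parallel|B_\perp}\,(2\pi)^{-(3N+1)/2}\,C^{-3N/2}
\left[\Gamma\!\left(\frac{3N}{2},\frac{C}{2\sigma_{n,\mathrm{max}}^2}\right)-\Gamma\!\left(\frac{3N}{2},\frac{C}{2\sigma_{n,\mathrm{min}}^2}\right)\right],
\tag{38}
$$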
which reduces to Eq. (36) when $\sigma_{n,\mathrm{min}} \to 0$ and $\sigma_{n,\mathrm{max}} \to \infty$ due to the properties of the incomplete Gamma function
$\Gamma(a,x)$ (e.g., Abramowitz & Stegun 1972). It is also possible to limit the integration volume in $\sigma_n$, $\sigma_\parallel$ and $\sigma_\perp$, arriving
at the most general hierarchical model we consider that can be applied when the noise variance is unknown:



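$$
\begin{aligned}
p(f,B_\parallel,B_\perp,\chi|D) \propto{}& \Pi\!\left(f-\tfrac{1}{2}\right)\Pi\!\left(\frac{\chi-\pi}{2\pi}\right)\frac{1}{B_\parallel B_\perp}
\left[\mathrm{erf}\!\left(\frac{B_\parallel}{\sqrt{2}\,\sigma_\mathrm{min}}\right)-\mathrm{erf}\!\left(\frac{B_\parallel}{\sqrt{2}\,\sigma_\mathrm{max}}\right)\right]
\left[\exp\!\left(-\frac{B_\perp^2}{2\sigma_\mathrm{max}^2}\right)-\exp\!\left(-\frac{B_\perp^2}{2\sigma_\mathrm{min}^2}\right)\right]\\
&\times C^{-3N/2}\left[\Gamma\!\left(\frac{3N}{2},\frac{C}{2\sigma_{n,\mathrm{max}}^2}\right)-\Gamma\!\left(\frac{3N}{2},\frac{C}{2\sigma_{n,\mathrm{min}}^2}\right)\right].
\end{aligned}
\tag{39}
$$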
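The noise marginalization behind the Gamma-function factors follows from the substitution $t = C/(2\sigma_n^2)$, which turns $\int d\sigma_n\, \sigma_n^{-3N-1} \exp[-C/(2\sigma_n^2)]$ into $\frac{1}{2}(2/C)^{3N/2} \int dt\, t^{3N/2-1} e^{-t}$. The sketch below evaluates the resulting difference of incomplete Gamma functions by direct quadrature, using only the standard library. The function names and the midpoint rule are illustrative choices, not the author's implementation.

```python
import math


def gamma_inc_diff(s, a, b, n=20000):
    """Gamma(s, a) - Gamma(s, b): integral of t**(s-1) * exp(-t) from a to b.

    Midpoint rule in u = log(t), where t**(s-1) exp(-t) dt = exp(s*u - exp(u)) du.
    """
    lo, hi = math.log(a), math.log(b)
    h = (hi - lo) / n
    total = 0.0
    for i in range(n):
        u = lo + (i + 0.5) * h
        total += math.exp(s * u - math.exp(u))
    return total * h


def noise_marginal(c, n_wave, sn_min, sn_max):
    """C-dependent factor of the posterior after marginalizing the noise.

    c is the misfit C built from the primed coefficients, n_wave is the number
    of wavelength points N, and [sn_min, sn_max] is the allowed noise range.
    """
    s = 1.5 * n_wave
    return c ** (-s) * gamma_inc_diff(s, c / (2.0 * sn_max**2), c / (2.0 * sn_min**2))
```

For very wide noise ranges the bracket tends to $\Gamma(3N/2)$, which recovers the full-domain result.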



Concerning the posterior for the magnetic flux density, marginalizing Eq. 
find: 



2 ' 2a2 



over a„ with the Jeffreys' prior, we 

(40) 



where 



9A 



2F||^al/, 



2'2< 
dl. 



(41) 



The prior distribution $p(F_\parallel)$ can be that of Eq. (31) or the most general one of Eq. (34).



5. ILLUSTRATIVE EXAMPLES 



5.1. Observed profile

In order to illustrate the behavior of the non-hierarchical and hierarchical models that we have developed, let us
consider extracting information under the weak-field approximation from the wavelength variation of a selected Stokes
profile observed with IMaX (Martínez Pillet et al. 2004; Martínez Pillet et al. 2011) onboard Sunrise (Solanki et al.
2010) in a quiet region of the solar atmosphere. The observed profile is shown in Fig. 1. The reason for choosing
this instrument is to show the power of Bayesian methods to deal with cases in which there is reduced information
in the observables. The number of sampled points is just 5, which makes the typical shape of the polarization profiles
hardly distinguishable. In order to apply the previous formulation we need an estimate of the first and
second derivatives of the Stokes $I$ profile. The poor spectral sampling makes this operation delicate. However, we have
decided to apply a numerical derivative based on a standard 3-point Lagrange interpolation of the profile. This could
have been improved by, for instance, fitting the intensity profile to a Gaussian and using this fitted profile to carry out
the derivative. We have verified that the differences between both approaches are negligible.
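The 3-point Lagrange derivative mentioned above can be sketched as follows. This is an illustrative implementation, not the one used for the results in this paper: the function name is hypothetical, interior points use a centered stencil, and the two end points reuse the nearest three samples.

```python
def lagrange3_derivatives(x, y):
    """First and second derivatives of y(x) from 3-point Lagrange interpolation.

    Works on non-uniform grids. Returns two lists with one value per sample.
    """
    n = len(x)
    if n < 3 or len(y) != n:
        raise ValueError("need at least three samples and matching lengths")
    d1, d2 = [], []
    for i in range(n):
        j = min(max(i, 1), n - 2)  # centre of the stencil (shifted at the edges)
        x0, x1, x2 = x[j - 1], x[j], x[j + 1]
        # Lagrange weights: y_k divided by the product of node differences
        w0 = y[j - 1] / ((x0 - x1) * (x0 - x2))
        w1 = y[j] / ((x1 - x0) * (x1 - x2))
        w2 = y[j + 1] / ((x2 - x0) * (x2 - x1))
        xi = x[i]
        # Derivative of the interpolating parabola evaluated at x[i]
        d1.append(w0 * (2.0 * xi - x1 - x2)
                  + w1 * (2.0 * xi - x0 - x2)
                  + w2 * (2.0 * xi - x0 - x1))
        # The second derivative of a parabola is constant over the stencil
        d2.append(2.0 * (w0 + w1 + w2))
    return d1, d2
```

With only 5 wavelength samples, the second derivative takes just three distinct values (one per stencil), which illustrates why the operation is delicate.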

Figure 2 shows the joint posterior marginal distributions $p(B_\parallel,B_\perp|D)$ in the first column, $p(f,B_\parallel|D)$ in the second
one and $p(f,B_\perp|D)$ in the third. These marginal distributions help us distinguish which parameters are degenerate
and help us understand the behavior of the full posterior. For comparison, the first two rows (labeled a) present cases
obtained by marginalization of the non-hierarchical posterior of Eq. (13). Specifically, these panels were obtained
by numerical integration of Eq. (B1). The middle two rows (labeled b) are obtained marginalizing the hierarchical
posterior of Eq. (33) assuming Gaussian noise with a standard deviation of $\sigma_n = 2 \times 10^{-3}$ in units of the continuum
intensity. Specifically, they are computed by numerical integration of Eq. (B7). Finally, the last two rows (labeled
c) are obtained applying a numerical quadrature to the full hierarchical posterior of Eq. (39) that marginalizes the
noise. Two typical values of the hyperparameters are considered in each one of the posteriors. The values of the
relevant parameters are indicated in each plot and in the caption. As a general rule, the non-hierarchical model has
a stronger sensitivity to the prior hyperparameters than the hierarchical models (e.g., Gelman et al. 2003). In our
case, the hierarchical models are almost insensitive to the exact values of the hyperparameters, even though we have
used two extreme values. The joint marginal posteriors clearly indicate the presence of strong degeneracies between
$f$ and both $B_\parallel$ and $B_\perp$, as is widely known. However, there is still some information available in the observations about
each parameter individually, at least sufficient to discard regions of the space of parameters. Although part of this
information is encoded in the priors, it is transparently introduced and can be controlled at will. Furthermore, this
information is based on very general considerations about the behavior of the magnetic field (like typical maximum
values present in the solar atmosphere).

When extracting information for a given parameter, it is necessary to marginalize out the rest of the parameters. This
marginalization carries out error propagation correctly, and degeneracies show up as long tails in the marginal
posteriors. Figure 3 presents these posteriors. The first column corresponds to the non-hierarchical model, while the
second and third columns are the results of the two hierarchical models. They were obtained numerically by integrating
the distributions shown in Fig. 2. The first row is the marginal posterior for $f$, the second for $B_\parallel$, the third for $B_\perp$
and the fourth for $F_\parallel$. The dashed lines are the priors considered for each parameter. In the non-hierarchical model,
the MMAP value for $B_\parallel$ seems to be relatively robust to the choice of the prior, while it has a larger dependence on
the prior for $B_\perp$ and $f$. According to the shapes of the marginal posterior distributions, it is possible to give the most
probable value together with asymmetric error bars which contain 68% or 95% of the total mass of the distribution.
In this particular example, the lower limit of $B_\parallel$ will be much larger in magnitude than the upper limit. The reason
is that the sign of $B_\parallel$ controls the sign of the Stokes $V$ profile, and positive fields are strongly discarded by the data
because the polarity is not the correct one. The same applies to $f$ and $B_\perp$, which have a very asymmetric confidence
interval. Interestingly, since $F_\parallel$ is a quantity directly proportional to the amplitude of Stokes $V$ in the weak-field
regime, its marginal posterior distribution is very well determined and its width is fundamentally characterized by the
noise standard deviation.
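The most probable value and the asymmetric error bars described above can be extracted from a tabulated marginal posterior. The following sketch assumes a uniform grid and a unimodal posterior, and uses a highest-density interval as one possible definition of the error bars. The function name and that choice of interval are assumptions made for illustration.

```python
import math


def summarize_posterior(grid, pdf, mass=0.68):
    """Return (mode, lower, upper) for a 1-D posterior tabulated on a uniform grid.

    Grid points are accumulated in order of decreasing density until the
    requested fraction of the total mass is enclosed. For a unimodal posterior
    the selected points form a single interval.
    """
    total = sum(pdf)  # uniform grid: the sum is proportional to the integral
    order = sorted(range(len(pdf)), key=lambda i: pdf[i], reverse=True)
    enclosed, chosen = 0.0, []
    for i in order:
        chosen.append(i)
        enclosed += pdf[i]
        if enclosed >= mass * total:
            break
    mode = grid[order[0]]
    return mode, grid[min(chosen)], grid[max(chosen)]


# Usage with a skewed toy posterior, p(x) proportional to x * exp(-x)
example_grid = [0.01 * i for i in range(2001)]
example_pdf = [x * math.exp(-x) for x in example_grid]
example_mode, example_low, example_high = summarize_posterior(example_grid, example_pdf)
```

For a skewed distribution such as the toy example, the upper error bar is longer than the lower one, as happens for the long-tailed marginal posteriors discussed in the text.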

Given the sensitivity of the non-hierarchical model to the choice of the prior, we also consider the two hierarchical
models. The results are very robust to the choice of the hyperparameters (the red and black curves overlap). The
filling factor is not constrained at all by the observations, as expected in the weak-field regime. In spite of that, the
posterior for the component of the field along the line of sight is constrained and one can define a MMAP value and a
confidence interval. In any case, the analysis discards positive values and very large negative values of $B_\parallel$. Concerning
$B_\perp$, the posterior is very similar to the prior. Since the prior is based on physical considerations, we can safely give
an upper limit to $B_\perp$, but always taking into account that this comes essentially from a-priori knowledge.

The posterior for $F_\parallel$ is again very close to a Gaussian with mean and variance given by Eq. (25) in the case of the
hierarchical model with known noise. Note also that there is an additional peak at zero magnetic flux density. This
peak is produced by the marginalization over the priors, which opens the possibility that the observed Stokes $V$ signal
is compatible with the absence of magnetic field. Concerning the hierarchical model with unknown noise standard
deviation, the posterior is slightly narrower, induced by the Jeffreys' prior.

Finally, Fig. 4 presents the marginal posteriors for the derived quantities $\theta$, $B$ and $E_\mathrm{mag}$, which were obtained
numerically by integrating the distributions shown in Fig. 2 using the appropriate change of variables. The results
indicate that we can put upper limits on the field strength and on the magnetic energy, and that the data contain some
information on these two variables, with posteriors clearly different from the priors. Concerning the field inclination,
the marginal posterior looks like the prior distribution, so that we can only discard positive inclinations because they
are not compatible with the Stokes $V$ profile polarity.

5.2. Degradation of information. The effect of noise 

There are two important ingredients that differentiate a standard maximum-likelihood (or least-squares) approach
from a Bayesian approach. The first one is that the solution to the inference problem is given in terms of marginal
posterior distributions that already encode error propagation. The second is that, thanks to the effect of the priors,
when the noise is too large, the marginal posteriors resemble the prior. This way, one can clearly state when a parameter
is constrained by the data and avoid biased estimators. To show this effect, we have analyzed how noise
modifies the marginal posteriors. We consider the Stokes profiles of the Fe I line at 6302.4904 Å synthesized using the
Milne-Eddington approximation with $B = 400$ G, $\theta = 30°$ and $\chi = 20°$. This line has a large magnetic sensitivity, with
$g = 2.5$ and $G = 6.25$. Considering $f = 1$, the amplitude of Stokes $V$ is of the order of 2% in units of the continuum
intensity, $I_c$, while that of Stokes $Q$ and $U$ is of the order of 0.03% in the same units. We corrupt these profiles with
Gaussian noise with zero mean and standard deviations $\sigma_n = \{5 \times 10^{-4}, 10^{-3}, 5 \times 10^{-3}, 10^{-2}\}$ in units of $I_c$. The case
with the smallest $\sigma_n$ results in a noise amplitude of roughly the same amplitude as the linear polarization signal. We
use the hierarchical model with known noise variance of Eq. (33) and numerically calculate the marginal posteriors.
The results are shown in Fig. 5 for all physical parameters of relevance. Each color corresponds to a different value of
$\sigma_n$. The vertical dotted lines indicate the maximum-likelihood values that maximize Eq. (5). The curves are slightly
dependent on the actual noise realization because the $A_i$ coefficients change. We plot the results for only one noise
realization.

As a consequence of the selected values for the magnetic field vector and $f$, the original synthetic profile corresponds
to $B_\parallel = 346.4$ G and $B_\perp = 200$ G. The marginal posterior for $B_\parallel$ indicates that the MMAP value is compatible with
the correct value only when the noise is not too large, so that the Stokes $V$ signal is not too perturbed. As soon as
the noise increases, there is a shift towards smaller $B_\parallel$. An interesting property of the solution is that the confidence
intervals are extremely asymmetric, with small values of $B_\parallel$ absolutely discarded and a very large upper limit. This
is produced, among other things, by the marginalization of the filling factor. Something similar happens for the field
strength.
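The quoted components follow from projecting the field vector onto the line of sight and onto the plane of the sky, $B_\parallel = B\cos\theta$ and $B_\perp = B\sin\theta$. A minimal sketch of this conversion and of its inverse, with hypothetical function names:

```python
import math


def field_components(b, theta_deg):
    """Longitudinal and transverse components of a field of strength b (G)
    inclined by theta_deg degrees with respect to the line of sight."""
    theta = math.radians(theta_deg)
    return b * math.cos(theta), b * math.sin(theta)


def field_from_components(b_par, b_perp):
    """Field strength (G) and inclination (degrees) from the two components."""
    return math.hypot(b_par, b_perp), math.degrees(math.atan2(b_perp, b_par))


# B = 400 G and an inclination of 30 degrees, the values of the synthetic test
b_par, b_perp = field_components(400.0, 30.0)
```

The same change of variables, applied to the joint posterior rather than to a single point, is what yields the marginal posteriors of the derived quantities.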

Concerning $B_\perp$, the marginal posterior is different from the prior only for $\sigma_n = 5 \times 10^{-4}$, with a bump roughly close
to the correct value although displaced towards smaller values. As soon as the noise increases, there is no remaining
information about this component of the field in the profiles and one can only put an upper limit. A consequence
of this is the fact that the marginal posterior for the azimuth is essentially flat. Some peaks of larger probability are
located close to the correct value but without statistical relevance. Since the field inclination depends on $B_\parallel$ and $B_\perp$,
the marginal posterior for the inclination indicates that a reliable inclination is only inferred when the noise is sufficiently
small. For large noise levels, only upper limits can be correctly defined. Finally, the marginal posteriors for the magnetic
flux density are Gaussian with a width proportional to the noise standard deviation (in G), as shown in Eq. (25).

6. CONCLUSIONS 

We have considered the complete Bayesian inference of the parameters of a model based on the weak-field
approximation to explain observed Stokes profiles. The simplicity of the approximation has allowed us to obtain a closed
analytical expression for the posterior distribution function and for some of the ensuing marginal posteriors. Thanks to
the Bayesian approach, prior information is transparently introduced into the problem. We have verified that results
are sensitive to the hyperparameters of the prior and we have developed a hierarchical approach based on physical
arguments that introduces regularization into the problem. As a consequence, marginal posteriors are almost
insensitive to the exact value of the hyperparameters. Using the Bayesian approach we are able to extract not only the most
probable values but also confidence intervals for the model parameters. This Bayesian approach can be of interest
for filter-polarimetric data in which the line profiles are only sampled at a reduced number of wavelengths. Likewise,
signals very close to the noise can be treated under this formalism, avoiding the biased estimations that plague least-squares
solutions. It is left for the future to carry out an in-depth analysis of the biases introduced by least-squares solutions
using the expressions introduced in this paper.

^ Computer programs that calculate all the quantities presented in this paper can be freely obtained from
http://www.iac.es/project/magnetism.
in Section 5. Financial support by the Spanish Ministry of Science and Innovation through project AYA2010-18029
(Solar Magnetism and Astrophysical Spectropolarimetry) is gratefully acknowledged.



APPENDIX 
CHANGE OF VARIABLES 



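Given two random variables $x$ and $y$ with joint distribution $p(x,y)$, the probability distribution $q(w)$ for the derived
quantity $w = f(x,y)$ with $f(x,y)$ invertible can be obtained following the standard change of variables rule. If we define
the auxiliary variable $z = g(x,y)$, we can write the following direct and inverse relations:

$$
\begin{aligned}
w &= f(x,y), & x &= \xi(z,w),\\
z &= g(x,y), & y &= \gamma(z,w).
\end{aligned}
\tag{A1}
$$

As a consequence, the joint probability distribution of $z$ and $w$ can be written as:

$$
q(z,w) = p\left(\xi(z,w),\gamma(z,w)\right)|J|,
\tag{A2}
$$

where $J$ is the Jacobian of the transformation given by Eq. (A1):

$$
J = \begin{vmatrix}
\dfrac{\partial\xi(z,w)}{\partial z} & \dfrac{\partial\xi(z,w)}{\partial w}\\[2ex]
\dfrac{\partial\gamma(z,w)}{\partial z} & \dfrac{\partial\gamma(z,w)}{\partial w}
\end{vmatrix}
= \begin{vmatrix}
1 & 0\\[1ex]
\dfrac{\partial\gamma(z,w)}{\partial z} & \dfrac{\partial\gamma(z,w)}{\partial w}
\end{vmatrix}
= \frac{\partial\gamma(z,w)}{\partial w},
\tag{A3}
$$

where the last two steps assume $\xi(z,w) = z$ to simplify computations. The probability distribution $q(w)$ is obtained
by marginalizing $z$ from Eq. (A2):

$$
q(w) = \int dz\, p\left(z,\gamma(z,w)\right)\left|\frac{\partial\gamma(z,w)}{\partial w}\right|.
\tag{A4}
$$

For instance, if we want to carry out the transformation $w = f(x,y) = xy$, then $\gamma(z,w) = w/z$, so that

$$
q(w) = \int dz\, p\left(z,\frac{w}{z}\right)\frac{1}{|z|}.
\tag{A5}
$$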
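The product rule for $w = xy$ can be verified on a case with a known answer. For two independent variables that are uniform in $[0,1]$, the density of the product is $q(w) = -\ln w$. The sketch below evaluates the marginalization integral with a midpoint rule. The function names are hypothetical, and restricting $z$ to $(0,1]$ is an assumption that suits a variable such as the filling factor.

```python
import math


def product_density(w, p, n=20000):
    """Density of w = x*y from the joint density p(x, y).

    Evaluates q(w) = integral over z in (0, 1] of p(z, w/z) / |z| dz
    with a midpoint rule.
    """
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        z = (i + 0.5) * h
        total += p(z, w / z) / abs(z)
    return total * h


def uniform_square(x, y):
    """Joint density of two independent variables, each uniform in [0, 1]."""
    return 1.0 if (0.0 <= x <= 1.0 and 0.0 <= y <= 1.0) else 0.0


# Density of the product of two uniform variables at w = 0.3
q_value = product_density(0.3, uniform_square)
```

This is the operation needed to obtain the distribution of a derived quantity such as $F_\parallel = fB_\parallel$ from the joint distribution of $f$ and $B_\parallel$.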
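MARGINAL POSTERIORS

This appendix presents some of the analytical posteriors that can be obtained from Eqs. (13) and (33).

Non-hierarchical model

If we integrate the field azimuth from the posterior of the non-hierarchical model, we obtain:

$$
\begin{aligned}
p(f,B_\parallel,B_\perp|D) \propto{}& \Pi\!\left(f-\tfrac{1}{2}\right)(2\pi)^{-(3N+1)/2}\sigma_n^{-3N}\frac{B_\perp}{\sigma_\parallel\sigma_\perp^2}\,I_0\!\left(2fB_\perp^2\sqrt{A_5^2+A_6^2}\right)\\
&\times\exp\!\left\{-\left[A_1+\left(A_2f^2+\frac{1}{2\sigma_\parallel^2}\right)B_\parallel^2+A_3f^2B_\perp^4-2A_4fB_\parallel+\frac{B_\perp^2}{2\sigma_\perp^2}\right]\right\},
\end{aligned}
\tag{B1}
$$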



where $I_0(x)$ is the modified Bessel function of the first kind (Abramowitz & Stegun 1972). It might be useful to apply
the following series expansion of $I_0(x)$ to further integrate out the filling factor:

$$
I_0(x) = \sum_{k=0}^{\infty}\frac{\left(x^2/4\right)^k}{(k!)^2}.
\tag{B2}
$$
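Both the series and the azimuth integration that gives rise to the Bessel function can be checked numerically, since $(2\pi)^{-1}\int_0^{2\pi} \exp(a\cos 2\chi + b\sin 2\chi)\, d\chi = I_0(\sqrt{a^2+b^2})$. The sketch below uses hypothetical function names and is intended for moderate arguments only. For very large arguments an exponentially scaled version would be needed to avoid overflow.

```python
import math


def bessel_i0(x, tol=1e-16, max_terms=500):
    """Modified Bessel function I0(x) from its power series."""
    q = 0.25 * x * x
    term, total = 1.0, 1.0
    for k in range(1, max_terms + 1):
        term *= q / (k * k)  # ratio of consecutive terms: (x^2/4) / k^2
        total += term
        if term < tol * total:
            break
    return total


def azimuth_average(a, b, n=256):
    """Average of exp(a cos 2chi + b sin 2chi) over chi in [0, 2*pi).

    The trapezoid rule on a periodic integrand converges very quickly.
    """
    h = 2.0 * math.pi / n
    return sum(math.exp(a * math.cos(2.0 * i * h) + b * math.sin(2.0 * i * h))
               for i in range(n)) / n
```

In the posteriors, the argument is $2fB_\perp^2\sqrt{A_5^2+A_6^2}$, so `azimuth_average(a, b)` with $a = 2fB_\perp^2A_5$ and $b = 2fB_\perp^2A_6$ should agree with `bessel_i0` evaluated at that argument.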
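Likewise, marginalizing $B_\parallel$ and $B_\perp$, we obtain:

$$
\begin{aligned}
p(f,\chi|D) \propto{}& \Pi\!\left(f-\tfrac{1}{2}\right)\Pi\!\left(\frac{\chi-\pi}{2\pi}\right)(2\pi)^{-(3N+1)/2}\sigma_n^{-3N}\frac{1}{\sigma_\parallel\sigma_\perp^2}\,\frac{\pi}{4f}\left\{A_3\left[(2\sigma_\parallel^2)^{-1}+A_2f^2\right]\right\}^{-1/2}\\
&\times\exp\!\left\{-A_1+\frac{\left[(2\sigma_\perp^2)^{-1}-2\left(A_5\cos 2\chi+A_6\sin 2\chi\right)f\right]^2}{4A_3f^2}+\frac{A_4^2f^2}{(2\sigma_\parallel^2)^{-1}+A_2f^2}\right\}\\
&\times\left\{1+\mathrm{erf}\!\left[\frac{-(2\sigma_\perp^2)^{-1}+2\left(A_5\cos 2\chi+A_6\sin 2\chi\right)f}{2f\sqrt{A_3}}\right]\right\}.
\end{aligned}
\tag{B3}
$$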



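Concerning the marginal posterior for the field components, we have been unable to find a closed analytical expression
for the probability distributions $p(B_\parallel|D)$ and $p(B_\perp|D)$ because $B_\parallel$ and $B_\perp$ appear very intricately. However, the
marginal posteriors can be obtained using adequate numerical quadrature methods on any of the following joint
posteriors:

$$
\begin{aligned}
p(B_\parallel,B_\perp,\chi|D) \propto{}& \Pi\!\left(\frac{\chi-\pi}{2\pi}\right)(2\pi)^{-(3N+1)/2}\sigma_n^{-3N}\frac{B_\perp}{\sigma_\parallel\sigma_\perp^2}\,\frac{\sqrt{\pi}}{2}\left(A_2B_\parallel^2+A_3B_\perp^4\right)^{-1/2}\\
&\times\exp\!\left\{-A_1-\frac{B_\parallel^2}{2\sigma_\parallel^2}-\frac{B_\perp^2}{2\sigma_\perp^2}+\frac{\left[A_4B_\parallel+\left(A_5\cos 2\chi+A_6\sin 2\chi\right)B_\perp^2\right]^2}{A_2B_\parallel^2+A_3B_\perp^4}\right\}\\
&\times\left\{\mathrm{erf}\!\left[\frac{A_4B_\parallel+\left(A_5\cos 2\chi+A_6\sin 2\chi\right)B_\perp^2}{\sqrt{A_2B_\parallel^2+A_3B_\perp^4}}\right]
-\mathrm{erf}\!\left[\frac{A_4B_\parallel+\left(A_5\cos 2\chi+A_6\sin 2\chi\right)B_\perp^2-A_2B_\parallel^2-A_3B_\perp^4}{\sqrt{A_2B_\parallel^2+A_3B_\perp^4}}\right]\right\},
\end{aligned}
\tag{B4}
$$

$$
\begin{aligned}
p(f,B_\perp,\chi|D) \propto{}& \Pi\!\left(f-\tfrac{1}{2}\right)\Pi\!\left(\frac{\chi-\pi}{2\pi}\right)(2\pi)^{-(3N+1)/2}\sigma_n^{-3N}\frac{B_\perp}{\sigma_\parallel\sigma_\perp^2}\,\sqrt{\pi}\left[(2\sigma_\parallel^2)^{-1}+A_2f^2\right]^{-1/2}\\
&\times\exp\!\left\{-A_1+\frac{A_4^2f^2}{(2\sigma_\parallel^2)^{-1}+A_2f^2}+2\left(A_5\cos 2\chi+A_6\sin 2\chi\right)fB_\perp^2-\frac{B_\perp^2}{2\sigma_\perp^2}-A_3f^2B_\perp^4\right\},
\end{aligned}
\tag{B5}
$$

$$
\begin{aligned}
p(f,B_\parallel,\chi|D) \propto{}& \Pi\!\left(f-\tfrac{1}{2}\right)\Pi\!\left(\frac{\chi-\pi}{2\pi}\right)(2\pi)^{-(3N+1)/2}\sigma_n^{-3N}\frac{1}{\sigma_\parallel\sigma_\perp^2}\,\frac{\sqrt{\pi}}{4f\sqrt{A_3}}\\
&\times\exp\!\left\{-A_1-\left[A_2f^2+(2\sigma_\parallel^2)^{-1}\right]B_\parallel^2+2A_4fB_\parallel+\frac{\left[(2\sigma_\perp^2)^{-1}-2\left(A_5\cos 2\chi+A_6\sin 2\chi\right)f\right]^2}{4A_3f^2}\right\}\\
&\times\left\{1+\mathrm{erf}\!\left[\frac{-(2\sigma_\perp^2)^{-1}+2\left(A_5\cos 2\chi+A_6\sin 2\chi\right)f}{2f\sqrt{A_3}}\right]\right\}.
\end{aligned}
\tag{B6}
$$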
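The integration over the filling factor that leads to the pair of error functions rests on the identity $\int_0^1 \exp(-af^2 + 2bf)\, df = \frac{\sqrt{\pi}}{2\sqrt{a}}\, e^{b^2/a} \left[\mathrm{erf}(b/\sqrt{a}) - \mathrm{erf}((b-a)/\sqrt{a})\right]$, with $a = A_2B_\parallel^2 + A_3B_\perp^4$ and $b = A_4B_\parallel + (A_5\cos 2\chi + A_6\sin 2\chi)B_\perp^2$. A short numerical check, with hypothetical function names:

```python
import math


def f_integral_closed(a, b):
    """Closed form of the integral of exp(-a f^2 + 2 b f) over f in [0, 1], a > 0."""
    ra = math.sqrt(a)
    return (0.5 * math.sqrt(math.pi) / ra * math.exp(b * b / a)
            * (math.erf(b / ra) - math.erf((b - a) / ra)))


def f_integral_numeric(a, b, n=20000):
    """Same integral evaluated with a midpoint rule, for comparison."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        f = (i + 0.5) * h
        total += math.exp(-a * f * f + 2.0 * b * f)
    return total * h
```

When $b^2/a$ is large, the factor $e^{b^2/a}$ can overflow. In that regime it is safer to work with the logarithm of the expression.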



The previous expressions can be particularized to the specific case of a longitudinal magnetograph, in which only
circular polarization is observed. We obtain such a case making $A_3 = A_5 = A_6 = 0$, dropping $Q_i$ and $U_i$ from $A_1$ and
making $\sigma_\perp \to \infty$.

Hierarchical model 

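Starting from Eq. (33), we have integrated out the field azimuth, yielding:

$$
\begin{aligned}
p(f,B_\parallel,B_\perp|D) \propto{}& (2\pi)^{-(3N-1)/2}\sigma_n^{-3N}\,\Pi\!\left(f-\tfrac{1}{2}\right)
\exp\!\left\{-\left[A_1+A_2f^2B_\parallel^2+A_3f^2B_\perp^4-2A_4fB_\parallel\right]\right\}\\
&\times\frac{1}{B_\parallel B_\perp}\left[\mathrm{erf}\!\left(\frac{B_\parallel}{\sqrt{2}\,\sigma_\mathrm{min}}\right)-\mathrm{erf}\!\left(\frac{B_\parallel}{\sqrt{2}\,\sigma_\mathrm{max}}\right)\right]
I_0\!\left(2fB_\perp^2\sqrt{A_5^2+A_6^2}\right)\\
&\times\left[\exp\!\left(-\frac{B_\perp^2}{2\sigma_\mathrm{max}^2}\right)-\exp\!\left(-\frac{B_\perp^2}{2\sigma_\mathrm{min}^2}\right)\right].
\end{aligned}
\tag{B7}
$$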
Figure 1. Stokes profiles observed with IMaX and used in the examples. The profiles consist of 5 wavelength points that have been obtained
with a Fabry-Perot filter-polarimeter. For a better visualization, we have repeated the continuum point at 5250.45 Å symmetrically on the
red wing, and connected it with dashed lines. We have also marked the uncertainty (estimated standard deviation of the noise) in each
observed point with error bars.

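Finally, it is possible to analytically integrate out $f$:

$$
\begin{aligned}
p(B_\parallel,B_\perp,\chi|D) \propto{}& (2\pi)^{-3N/2}\sigma_n^{-3N}\,\Pi\!\left(\frac{\chi-\pi}{2\pi}\right)\left(A_2B_\parallel^2+A_3B_\perp^4\right)^{-1/2}\\
&\times\exp\!\left\{-A_1+\frac{\left[A_4B_\parallel+\left(A_5\cos 2\chi+A_6\sin 2\chi\right)B_\perp^2\right]^2}{A_2B_\parallel^2+A_3B_\perp^4}\right\}\\
&\times\left\{\mathrm{erf}\!\left[\frac{A_4B_\parallel+\left(A_5\cos 2\chi+A_6\sin 2\chi\right)B_\perp^2}{\sqrt{A_2B_\parallel^2+A_3B_\perp^4}}\right]
-\mathrm{erf}\!\left[\frac{A_4B_\parallel+\left(A_5\cos 2\chi+A_6\sin 2\chi\right)B_\perp^2-\left(A_2B_\parallel^2+A_3B_\perp^4\right)}{\sqrt{A_2B_\parallel^2+A_3B_\perp^4}}\right]\right\}\\
&\times\frac{1}{B_\parallel B_\perp}\left[\mathrm{erf}\!\left(\frac{B_\parallel}{\sqrt{2}\,\sigma_\mathrm{min}}\right)-\mathrm{erf}\!\left(\frac{B_\parallel}{\sqrt{2}\,\sigma_\mathrm{max}}\right)\right]
\left[\exp\!\left(-\frac{B_\perp^2}{2\sigma_\mathrm{max}^2}\right)-\exp\!\left(-\frac{B_\perp^2}{2\sigma_\mathrm{min}^2}\right)\right].
\end{aligned}
\tag{B8}
$$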
Figure 2. Joint marginal posterior distributions for $f$, $B_\parallel$ and $B_\perp$. The top two rows, labeled a), correspond to the non-hierarchical
case with two different values of $\sigma_\parallel$ and $\sigma_\perp$. The third and fourth rows, labeled b), show results with the hierarchical approach of Eq.
(33) for two values of $\sigma_\mathrm{max}$, while the last two rows, labeled c), correspond to the results of the hierarchical model of Eq. (39) for fixed
values of $\sigma_{n,\mathrm{min}} = 10^{-4}$ and $\sigma_{n,\mathrm{max}} = 10^{-2}$ and the same two values for $\sigma_\mathrm{min}$ and $\sigma_\mathrm{max}$ considered in the previous case. For clarity, the
contours at normalized probability 0.01, 0.1, 0.3, 0.5, 0.68, 0.95 are marked in black, red, blue, green, magenta and orange, respectively. 
Figure 3. One-dimensional marginal posteriors for $f$ (first row), $B_\parallel$ (second row), $B_\perp$ (third row) and $F_\parallel$ (fourth row). The left column
shows the non-hierarchical case, while the second and third columns present results in the two considered hierarchical models. The colors
are associated with different values of the hyperparameters. Note the robustness of the hierarchical models to the different values of the
hyperparameters. The dashed lines indicate the corresponding prior distribution. If the posterior of a given parameter is clearly different
from the prior, we can state that the data contain enough constraining information for this parameter.







Figure 4. Posterior for derived quantities for the profiles of Fig. 1 using the hierarchical model with $\sigma_n = 2 \times 10^{-3}$, $\sigma_\mathrm{min} = 0.1$ G and
$\sigma_\mathrm{max} = 4000$ G. The left panel shows the marginal posterior for the field inclination, $\theta$, in solid line, while the prior is shown in dashed line.
The middle panel presents the marginal posterior for the field strength, while the right panel shows the same for the magnetic energy.
Figure 5. Marginal posteriors for all physical parameters of relevance of synthetic Stokes profiles with increasingly higher noise levels.
The different colors correspond to different noise levels. The vertical dotted lines indicate the maximum-likelihood solution.